FORMERLY WILLOW RUN LABORATORIES, THE UNIVERSITY OF MICHIGAN

TASK 1

MULTISEGMENT TRAINING 
(R. Kauth and W. Richardson) 

1.1 OBJECTIVE 

The objective of Task 1 is to develop a sampling strategy for 
selecting training data, applicable to proportion estimation over a 
wide region. The main requisites of that strategy are that it produces 
a representative sample and that the training sample size is small
compared to the total area to be classified.

1.2 APPROACH

1. Create a conceptual basis for the problem of training in a 
large scale remote sensing system, incorporating the inputs 
from UCB, LARS, and other ERIM tasks, and consistent with 
LACIE operational constraints. 

2. Within this framework, propose a detailed methodology for 
training selection. 

3. Demonstrate the selection methodology in an intermediate scale 
exercise over a partition containing from 15 to 30 sample seg- 
ments from which 5 to 10 segments are selected for training 
and for which a wheat proportion estimate is made. 

4. Incorporate both multitemporal and across partition signature 
extension capability into the final procedure. 

5. Incorporate the capability to work with incomplete sets of 
multitemporal data and to optimize selection to make estimates 
at several times during the growing season. 
1.3 PROGRESS

A baseline version of Procedure B was previously provided to Task 2
of this contract for testing. During the last quarter such tests were
carried out with necessary technical support provided by this task.
(See Task 2 discussion.) Effort on the missing acquisition capability
was temporarily suspended. Coding of the missing acquisition capability
is about 90% complete. The problem of defining a composite procedure
combining desirable aspects of both Procedure B and JSC's Procedure 1
was investigated, resulting in recommendations to make key modifications
in P-1. These were included in the SR&T quarterly review, Sept. 11-14, 1978.

The major effort during the quarter was in the exercising of diagnos- 
tic tools and procedures to measure the performance of certain components 
of Procedure B not previously examined in detail. Note that the tests 
on the baseline version of Procedure B under Task 2 are tests of global 
performance compared to other approaches. The component performance 
tests carried out under this task are for the purpose of identifying 
and isolating the sources of variance and bias in Procedure B and of 
establishing optimal parameter settings. 

1.4 TECHNICAL DISCUSSION 

The technical discussion will be limited to two topics, namely
recommendations for immediate modifications to P-1 to improve efficiency,
and discussion of component tests of the spatial stratification portion
of Procedure B. Both of these topics were presented at the quarterly
review, September 11, 1978.

The three performance measures for machine processing which are
of greatest current interest are bias, variance, and analyst support. 

P-1 and Procedure B are designed to reduce bias with respect to the 
source of labels and in this they are successful. It appears to be 
worthwhile to trade some bias for other performance measures, such as
reduced variance (i.e., greater efficiency) or analyst convenience or
accuracy. Analyst accuracy and convenience may be substantially
improved by asking the analyst to label only reasonable features in
the scene, as indicated by blob interiors. This is discussed further
in section 1.4.2 and in the Task 2 discussion.

Efficiency can be gained in part by a better stratification of the
segment prior to bias correction. This is discussed in more detail in
the following section 1.4.1.



1.4.1 RECOMMENDATIONS FOR P-1 MODIFICATIONS 

The so-called bias correction in P-1 can be thought of as a regression
estimator or as a stratified sample estimate. In P-1 the spectral
data are formed into two strata, a nominal wheat class and a nominal
non-wheat class, using about half of the labelled samples (the Type I
dots) to assist in forming the two strata from 40 raw unsupervised
clusters. Other labelled samples (the Type II dots) which fall into
these strata are then used to make a final estimate of the wheat in
the scene.
The estimate is

$$\hat{P} = \frac{N_w}{N}\,\frac{n_{ww}}{n_w} + \frac{N - N_w}{N}\,\frac{n_{w\bar{w}}}{n_{\bar{w}}}$$

where

$N$ is the total number of pixels

$N_w$ is the number of pixels classified as nominal wheat

$n = n_w + n_{\bar{w}}$ is the total number of Type II dots

$n_{ww}$ is the number of wheat dots found in the nominal wheat class

$n_{w\bar{w}}$ is the number of wheat dots found in the nominal non-wheat class
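As a rough illustration of how the quantities just defined combine into a two-stratum estimate, the sketch below weights the wheat fraction of the Type II dots in each nominal class by that class's share of the pixels. The function name, argument names, and the counts in the example are hypothetical and are not taken from P-1 itself.

```python
def two_stratum_estimate(N, N_w, n_w, n_nw, n_ww, n_wnw):
    """Stratified wheat-proportion estimate from two nominal strata.

    N      total pixels in the segment
    N_w    pixels classified as nominal wheat
    n_w    Type II dots falling in the nominal wheat class
    n_nw   Type II dots falling in the nominal non-wheat class
    n_ww   wheat dots found in the nominal wheat class
    n_wnw  wheat dots found in the nominal non-wheat class
    """
    if n_w == 0 or n_nw == 0:
        raise ValueError("each nominal class needs at least one Type II dot")
    weight_wheat = N_w / N            # share of pixels in the nominal wheat stratum
    weight_other = (N - N_w) / N      # share of pixels in the nominal non-wheat stratum
    return weight_wheat * (n_ww / n_w) + weight_other * (n_wnw / n_nw)


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
estimate = two_stratum_estimate(N=1000, N_w=300, n_w=20, n_nw=40, n_ww=18, n_wnw=4)
print(round(estimate, 4))
```

With these made-up counts the nominal wheat stratum is 90% wheat and the nominal non-wheat stratum is 10% wheat, so the estimate is a 30/70 weighted mix of the two.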

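Since the so-called classes are only nominal, this expression may
be generalized to multiple strata, as in Procedure B. (In Procedure B
the sampling is with blobs instead of pixels, but that is not the
important point in this discussion.) Then the estimate is

$$\hat{P} = \sum_i \frac{N_i}{N}\,\frac{n_{wi}}{n_i}$$

where $N_i$ is the number of pixels in stratum $i$, $n_i$ is the number
of Type II dots falling in stratum $i$, and $n_{wi}$ is the number of
those dots labelled wheat. Writing $P_i$ for the true proportion of
wheat in stratum $i$, the variance of this estimate is

$$\mathrm{Var}(\hat{P}) = \sum_i \left(\frac{N_i}{N}\right)^2 \frac{P_i (1-P_i)}{n_i}$$

Suppose the dots are allocated to the strata in proportion to stratum
size, $n_i / n \approx N_i / N$. (This proportional allocation
automatically occurs with sufficient accuracy in P-1 because there are
only two strata. In Procedure B the samples must be directed to the
strata in order to approximate proportional sampling.)

Under this approximation the variance becomes

$$\mathrm{Var}(\hat{P}) \approx \frac{1}{n} \sum_i \frac{N_i}{N}\, P_i (1-P_i)$$

In order to obtain a measure of efficiency we compare the variance
in stratified sampling to the variance in unstratified sampling, namely

$$\frac{1}{n}\, P (1-P)$$

where $P$ is the overall true proportion of wheat in the scene.

The result is a ratio, the variance reduction factor, which is
independent of the number of Type II dots drawn.

$$\mathrm{R.V.} = \frac{\sum_i \frac{N_i}{N}\, P_i (1-P_i)}{P (1-P)}$$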
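The variance reduction factor can be sketched in a few lines, assuming it takes the form described in section 1.4.2: a size-weighted average of p_i(1 - p_i) over the strata, divided by P(1 - P) for the whole scene. The function name and the stratum sizes and proportions below are illustrative assumptions, not values from the report.

```python
def variance_reduction_factor(sizes, wheat_props):
    """R.V. for strata with given pixel counts and true wheat proportions.

    Assumes proportional allocation of samples to strata, so the result
    does not depend on the number of samples drawn.
    """
    total = sum(sizes)
    weights = [n / total for n in sizes]
    p_all = sum(w * p for w, p in zip(weights, wheat_props))
    if p_all <= 0.0 or p_all >= 1.0:
        raise ValueError("R.V. is undefined when the scene is all wheat or all non-wheat")
    # Size-weighted average of the within-stratum binomial variance terms.
    within = sum(w * p * (1 - p) for w, p in zip(weights, wheat_props))
    return within / (p_all * (1 - p_all))


# Illustrative cases: perfectly pure strata, equally mixed strata, and something between.
rv_pure = variance_reduction_factor([60, 40], [1.0, 0.0])
rv_mixed = variance_reduction_factor([60, 40], [0.3, 0.3])
rv_partial = variance_reduction_factor([50, 50], [0.8, 0.2])
print(rv_pure, round(rv_mixed, 3), round(rv_partial, 3))
```

Pure strata drive the factor to zero, strata that all share the scene-wide proportion give a factor of one, and partially pure strata fall in between.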


Now it has been shown by Cochran [1] and independently by
Holsztynski [2] that R.V. stays the same or increases when the number
of strata is decreased by combining existing strata. Theory thus
indicates that the strata in P-1 should not be combined but instead
should be sampled from proportionally.

In order to test this concept a group of 5 segments was examined.
P-1 clustering (ISOCLASS) had been performed upon these segments and
the cluster statistics were provided to us by G. Bahdwahr of JSC. The
statistics included the number of pixels in each of 40 clusters, the
Type I dot label assigned to the cluster, and the true proportion of
small grains in each. Using this information the R.V. for three
different procedures can be calculated: 'P-1', 'P-Smart' and 'P-B'.
'P-1' consists of combining the clusters into two strata based on the
Type I dot label. 'P-Smart' consists of combining the clusters into
two strata based on the true proportion of small grains in the cluster.
'P-B' consists of leaving all the clusters separated and assuming
that the samples are allocated proportional to the cluster size.
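The three grouping rules can be compared on a toy example. The sketch below is only an illustration under stated assumptions: the four clusters, their Type I labels, and the 0.5 threshold used for the 'P-Smart' split are all hypothetical (the report does not state the threshold), and R.V. is computed as a size-weighted average of p(1 - p) over strata divided by P(1 - P).

```python
def rv(sizes, props):
    """Variance reduction factor under proportional allocation."""
    total = sum(sizes)
    p_all = sum(n * p for n, p in zip(sizes, props)) / total
    within = sum((n / total) * p * (1 - p) for n, p in zip(sizes, props))
    return within / (p_all * (1 - p_all))


def merge(sizes, props, groups):
    """Pool clusters that share a group key into one stratum each."""
    pooled = {}
    for n, p, g in zip(sizes, props, groups):
        size, wheat = pooled.get(g, (0, 0.0))
        pooled[g] = (size + n, wheat + n * p)
    merged_sizes = [size for size, _ in pooled.values()]
    merged_props = [wheat / size for size, wheat in pooled.values()]
    return merged_sizes, merged_props


# Hypothetical cluster statistics: pixel counts and true small-grain proportions.
sizes = [400, 300, 200, 100]
props = [0.9, 0.6, 0.2, 0.05]
# Hypothetical Type I dot labels; a single dot can mislabel a mixed cluster.
type1_labels = ["wheat", "other", "wheat", "other"]

rv_p1 = rv(*merge(sizes, props, type1_labels))                   # 'P-1'
rv_smart = rv(*merge(sizes, props, [p >= 0.5 for p in props]))   # 'P-Smart' (assumed 0.5 cut)
rv_pb = rv(sizes, props)                                         # 'P-B': clusters left separate
print(round(rv_p1, 3), round(rv_smart, 3), round(rv_pb, 3))
```

In this toy case leaving the clusters separate gives the smallest factor, and grouping by noisy labels gives the largest, which is the same ordering the comparison in the text reports.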

Figure 1.1 shows the results of this comparison for the 5 segments.
The entries in the figure are the variance reduction factor, R.V. (Recall
that a small value of R.V. is good, being a measure of the number of
samples which must be labelled to achieve some stated variance of the
estimate.) The table shows that 'P-Smart' has a lower R.V. than 'P-1'
by about .2, which indicates that efficiency is lost in combining
strata. 'P-B' averages about .1 lower R.V. than 'P-Smart'.

Based on this example our recommendations, presented at the
quarterly review, are as shown in Figure 1.2. Type I dots should not
be used to group clusters; instead they should be used directly to
make the segment proportion estimates. It appears that a factor of
about 3 might be saved in labelling effort by this step.

1.4.2 DISCUSSION OF BLOB PARAMETER SETTING AND PERFORMANCE TESTS

(The following discussion is substantially the text of appropriate 
portions of the quarterly review presentation of Wyman Richardson.) 

Figure 1.3 

We have been conducting experiments on 13 Kansas segments to
measure the performance of components of Procedure B.

Further testing of Procedure B on N. Dakota segments has been
carried out under Task 2 of this contract.

Figure 1.4

The components of Procedure B we have been testing are 

The grouping of the data into spatial-spectral clusters 
(blobbing) 

The grouping of blobs into B clusters 
Selection of training segments 
Selection of training blobs 
The method of proportion estimation 

Tests on the last four components were reported on last time. We now 
report tests on the first component. These tests have been made 
possible by the provision of exact ground truth. 

Figure 1.5 

To help in the understanding of the blob test, let me first 
describe the blob algorithm. 

Blob is a way of grouping the pixels of a scene into clusters that
are spatially and spectrally homogeneous. The Blob algorithm is to add
two more channels, line number and point number, to the multispectral
data channels. Line and point are the spatial variables. A standard
clustering algorithm is then used to add pixels to existing clusters
and to create new clusters when the pixel is not close enough to any
existing cluster.

The distance function used in the clustering is this: each
channel of the pixel is in turn subtracted from the corresponding
channel of the candidate blob mean, this difference is squared,
divided by a weight which we'll call a variance, and finally, all
such terms are added up. Notice that the last 2 terms are spatial
and the first ones are spectral.

The pixel is added to the existing blob with the smallest
distance unless this minimum distance is greater than a parameter TAU,
in which case the pixel becomes the seed point of a new cluster.
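The assignment rule just described can be sketched as follows. This is a minimal illustration, not the BLOB program: the function names, the running-mean update of the blob mean, and the example weights and TAU are assumptions, and the example uses two spectral channels plus line and point rather than the six spectral variables used in this task.

```python
def blob_distance(pixel, blob_mean, variances):
    """Sum over channels of squared difference divided by that channel's weight."""
    return sum((x - m) ** 2 / v for x, m, v in zip(pixel, blob_mean, variances))


def assign_pixel(pixel, blobs, variances, tau):
    """Add `pixel` to the nearest blob, or seed a new blob if none is within tau.

    Returns the index of the blob that received the pixel.
    """
    best_index, best_distance = None, None
    for index, blob in enumerate(blobs):
        d = blob_distance(pixel, blob["mean"], variances)
        if best_distance is None or d < best_distance:
            best_index, best_distance = index, d
    if best_index is None or best_distance > tau:
        blobs.append({"mean": list(pixel), "count": 1})  # pixel seeds a new blob
        return len(blobs) - 1
    blob = blobs[best_index]
    blob["count"] += 1
    # Assumed update rule: keep a running mean of all pixels in the blob.
    blob["mean"] = [m + (x - m) / blob["count"] for m, x in zip(blob["mean"], pixel)]
    return best_index


# Channels: (spectral 1, spectral 2, line, point); weights and TAU are made up.
variances = [4.0, 4.0, 9.0, 9.0]
tau = 5.0
pixels = [(10, 20, 0, 0), (11, 21, 0, 1), (30, 5, 0, 2)]

blobs = []
labels = [assign_pixel(p, blobs, variances, tau) for p in pixels]
print(labels)
```

The first two pixels are spectrally and spatially close and share a blob; the third is spectrally distant, exceeds TAU, and seeds a new blob.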

In our task, we have used 6 spectral variables, brightness and
greenness from 3 biophases.

Figure 1.6 

In our study of Procedure B components, we are faced with the 
problem of specifying the parameters of the Blob algorithm, namely 
Var 1 through Var 6, VARL, VARP and TAU.

The spectral weights were set by referring to our previous work 
on finding the optimal spectral weights for grouping blobs into B 
clusters. A search pattern in 6 dimensional space convinced us that 
for B clustering, the best spectral weights, expressed as variances, 
were proportional to the effective ranges of the variables. Using 
the same proportion for blobbing weights, we have determined the 
spectral weights relative to each other. 

We can set the proportion of the line variance VARL to the point 
variance VARP so that the line s.d. represents the same geographic 
distance as the point s.d. This proportion is not 1 to 1 because 
points are sampled more frequently than lines. 

The next step is to determine the balance between the set of 
spectral variances and the pair of spatial variances. This was 
accomplished by holding the spectral variances constant and comparing 
the Blob reduction of variances for 3 sets of spatial variances, small, 
middle and large. (You understand that if all the parameters are 
increased by the same proportion, the algorithm remains exactly the 
same. That is why holding the spectral variances constant and varying 
the spatial variances and TAU is permissible.) 
Figure 1.7

The blob reduction of variance factor that is the arbiter for
determining the balance between the spectral and spatial weights is
the same expression as before. The denominator is $p$ times $1-p$, where
$p$ is the overall wheat proportion in the segment. The numerator is
an average of similar expressions, $p_i$ times $1-p_i$, where $p_i$ is the
proportion of wheat in blob $i$.

When a blob is pure wheat or pure other, either $p_i$ or $1-p_i$ is
zero and the term vanishes. Thus the Blob R.V. is a measure of the
purity of the blobs: the purer the blobs, the smaller the Blob R.V.
It's like golf: a low score is a good score. Perfect purity produces
a zero score; all blobs equally mixed produces a score of 1.

Figure 1.8 

Here are the results of computing the blob reduction of variance 
for 8 segments and 3 settings of the spatial variances. The small 
setting corresponds to more emphasis on the spatial, the large setting 
to more emphasis on the spectral. 

The general result is that Blob R.V. is quite stable for the
three settings. The vertical scale has been greatly stretched to show
any trend in the curves. But since we do need to decide on a setting,
we might as well use one that these gentle trends indicate is optimal.

The worst case, segment 10, has a minimum in the middle. But most
of the segments indicate a minimum at a lower setting. You will notice
that the average trend, represented by the dotted line, goes down as you
go from the upper to the middle setting. A setting of 3.46 and 6.0,
midway between the 2 lower settings, was chosen.

The parameter TAU, which determines whether a pixel should start
a new blob, was chosen so that the number of big blobs would roughly
correspond to the number of fields. TAU was kept the same for all
segments, rather than keeping the number of blobs constant, so that
segments with large or small fields would not be forced into an
uncomfortable mold.

TAU was determined in this way for the middle spatial setting.
For the upper and lower settings, a value of TAU was found to keep the
number of blobs constant. You understand that when you decrease TAU,
you increase the number of blobs and vice versa. And as a general
rule when the number of blobs increases, the reduction of variance
factor goes down. So to make the three spatial settings comparable,
the number of blobs was held constant within each segment while the
number of blobs was allowed to vary from segment to segment.

Figure 1.9 

I've just mentioned the term big blob. By a big blob, I mean one
that has interior pixels. An interior pixel is one that is surrounded
on all four sides, up, down, left and right, by pixels in the blob.
A pixel in a big blob that is not interior is called an edge pixel.
A blob with no interior pixels is called a small blob.
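These definitions translate directly into a pass over a blob label map. The sketch below assumes the map is a rectangular grid of blob numbers and, as a further assumption not stated in the text, treats pixels on the border of the segment as non-interior because they lack one of the four neighbours. The function name and the small map are hypothetical.

```python
def classify_pixels(labels):
    """Label every pixel 'interior', 'edge', or 'small' from a blob label map."""
    rows, cols = len(labels), len(labels[0])

    def is_interior(r, c):
        # Assumption: a pixel on the segment border cannot be interior.
        if r in (0, rows - 1) or c in (0, cols - 1):
            return False
        me = labels[r][c]
        neighbours = ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
        return all(labels[rr][cc] == me for rr, cc in neighbours)

    interior = [[is_interior(r, c) for c in range(cols)] for r in range(rows)]
    # A big blob is any blob that owns at least one interior pixel.
    big_blobs = {labels[r][c] for r in range(rows) for c in range(cols) if interior[r][c]}

    kinds = []
    for r in range(rows):
        row = []
        for c in range(cols):
            if interior[r][c]:
                row.append("interior")
            elif labels[r][c] in big_blobs:
                row.append("edge")
            else:
                row.append("small")
        kinds.append(row)
    return kinds


# Hypothetical 4 x 5 label map with four blobs.
label_map = [
    [1, 1, 1, 2, 2],
    [1, 1, 1, 2, 2],
    [1, 1, 1, 3, 3],
    [4, 4, 4, 3, 3],
]
kinds = classify_pixels(label_map)
flat = [k for row in kinds for k in row]
counts = {k: flat.count(k) for k in ("interior", "edge", "small")}
print(counts)
```

Only blob 1 has a pixel surrounded on all four sides, so it is the one big blob here; its remaining pixels are edge pixels and every pixel of the other three blobs is a small blob pixel.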

Figure 1.10 

Here is a gray map of the blobs in Segment 1865. The big blobs 
are represented by printed characters and the small blobs by blanks.
Although there are many small blobs, they don't contain many pixels. 
They mostly represent boundaries between fields. 

Figure 1.11 

The parameter settings chosen produced blobs with these
characteristics:

Segment 1 is good for comparing the number of blobs with the
number of fields because it has very few small fields. Some of the 
other segments have many fields that one would not expect to correspond 
to big blobs. 

Note that the number of big blobs in a segment is remarkably 
constant. 

Although the small blobs are numerous, they don’t represent many 
pixels. They average 19% of the pixels. 

About half the pixels are edge pixels in big blobs. 

The interior pixels average 30% of the total. 

Figure 1.12 

This table shows the performance of the operations of blobbing 
and B clustering as measured by the R.V. factor. Remember that the 
R.V. factor ranges from 0 to 1, 0 representing perfect stratification 
and 1, worthless stratification. 

The table also shows the lower limit of purity imposed by the 
number of mixed pixels. A "mixed pixel" here refers to a mixture of 
wheat and non-wheat. If other crop mixtures were counted, a much 
higher number of mixed pixels would be observed. 

The mixed pixel R.V. measuring the average purity of the pixels 
is analogous to the blob and B cluster R.V. 

The blob R.V. is given in the "all" column and the B cluster
R.V. in the right hand column.

It's not a coincidence that in every case, the pixel R.V. < the
blob R.V. < the B cluster R.V. A member of our staff has proved that
whenever you combine groups into larger groups, the average purity, as
measured by the R.V., can never get better, and only in rare
coincidences can it stay the same.


Thus the pixel R.V. is a limitation on how good the blob R.V.
can be and similarly, the blob R.V. is a limitation on how good the
B cluster R.V. can be.

The R.V. score for blob interiors is very good, showing that the
blob operation is doing its job in the sense that although there may
be some confusion at the edges of the blobs, the interiors of the blobs
are quite pure.

The average R.V. for B clusters, .58, can be interpreted like this:
If we were to sample the pixels of each B cluster at random in such a
way that the number sampled from each B cluster were proportional to
the size of the cluster, and form the stratified estimate of the
percent wheat in the segment, then that estimate would have a certain
variance. We compare that variance with the variance of an estimate
obtained by sampling at random from one pool of all the pixels in the
segment and we find that the stratified estimate variance is only 58%
of the overall estimate variance. Since variance is inversely
proportional to sample size, it follows that the stratified estimate
would require only 58% of the identifications needed by the overall
estimate to achieve the same variance.

This column is a misleading indication of the value of Procedure B 
because Procedure B samples blobs rather than pixels. This would be 
expected to reduce the variance because the proportion of wheat in a 
blob randomly chosen from a B cluster would be expected to have less 
variance than the proportion of wheat in a pixel randomly chosen from 
a B cluster.

The purity of blob interiors offers some hope of solving a 
practical difficulty with sampling. Our wheat estimates are all based 
on samples provided by the judgments of analyst interpreters. If these 
AI's are asked to identify a pixel at random, the chance is 70% that
it will come from an edge or a small blob and is therefore likely to 
be on or close to a field boundary. What with multitemporal registration 
errors and mixed spectral responses, it would seem a formidable, if 
not impossible task to identify such a pixel. But if asked to identify 
a relatively pure blob interior, the AI would have a much better chance 
to respond accurately. 

Figure 1.13 

This table of different methods of estimating the wheat % gives 
empirical information about the accuracy of such a procedure. 

The column on the left is the % of wheat in the scene computed 
from every pixel. The figures in this column and indeed all of the 
figures I have displayed except for the column on the right, are based 
on the ground truth data recently computed at ERIM. The column on the 
right is a measurement of the % wheat in the scene made by planimetry
at JSC a couple of years ago. The average absolute difference between 
the right hand column and the left column is 8 tenths of 1%, showing 
that even the most careful measurements from high resolution photography 
are subject to an error of about 1%. I'm leaving out of the calculation 
the discrepancy in segment 27 which is caused by the failure of the 
photography to cover the top quarter of the segment. 

The next four columns are the % wheat computed on various subsets
of pixels. The big blob pixel and the edge pixel estimates are quite 
close to the measured truth, while the small blob pixel and the interior 
pixel estimates are erratic. 

The estimate on small blob pixels has a bias that averages 4% and
ranges from -8% to 17%. The bias in the estimate from the small blob
pixels has to be balanced by a bias in the opposite direction on the
big blob pixels. The big blob bias is smaller because there are more
pixels in the big blobs. The estimate on interior pixels has a bias
that averages -3% and ranges from -6.5% to 2.5%.

The interior pixel estimate was computed by simply totalling the
% wheat of the interior pixels and dividing by the number of
interior pixels.

Another estimate based on interiors is to assign to all pixels
in a big blob the proportion of wheat found in the blob interior. This
is the "extrapolated from interior" column. This estimate is less
biased and less erratic than the simple-minded interior pixel estimate.
In fact, its bias and average absolute error are close to the error
between the 2 planimetry measurements.

This estimate, moreover, is more achievable by the AI because it 
is based only on interiors of blobs. 

A more realistic estimate yet is obtained on the assumption that 
on the relatively pure interiors, the AI will identify either 100% 
wheat or 0%. The "extrapolation from pure interiors" is such an 
estimate. Either 100% or 0% is extrapolated to all the pixels of the 
blob and then the % wheat of all pixels is obtained. 

This estimate also gives very good results and has the advantage 
of representing a practical sampling procedure. 
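The three interior-based estimates described above can be sketched side by side. The blob records and field names below are hypothetical, the averages are taken over big-blob pixels only, and the rule that an analyst would call an interior "wheat" when it is at least half wheat is an assumption made for the illustration.

```python
def interior_pixel_estimate(blobs):
    """Average % wheat over interior pixels only."""
    total_interior = sum(b["interior"] for b in blobs)
    wheat = sum(b["interior"] * b["interior_wheat"] for b in blobs)
    return wheat / total_interior


def extrapolated_estimate(blobs, pure=False):
    """Assign each blob's interior wheat proportion to all of its pixels.

    With pure=True the interior is first rounded to 100% or 0% wheat,
    mimicking an analyst who gives a single label per interior.
    """
    total = sum(b["pixels"] for b in blobs)
    wheat = 0.0
    for b in blobs:
        share = b["interior_wheat"]
        if pure:
            share = 1.0 if share >= 0.5 else 0.0  # assumed labelling rule
        wheat += b["pixels"] * share
    return wheat / total


# Hypothetical big blobs: total pixels, interior pixels, wheat fraction in the interior.
big_blobs = [
    {"pixels": 100, "interior": 40, "interior_wheat": 0.95},
    {"pixels": 60, "interior": 10, "interior_wheat": 0.10},
    {"pixels": 40, "interior": 10, "interior_wheat": 0.00},
]

est_interior = interior_pixel_estimate(big_blobs)
est_extrapolated = extrapolated_estimate(big_blobs)
est_pure = extrapolated_estimate(big_blobs, pure=True)
print(round(est_interior, 3), round(est_extrapolated, 3), round(est_pure, 3))
```

In this toy case the plain interior estimate overweights the blob with the largest interior, while the two extrapolated estimates weight each blob by its full size and land close to one another.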

1.5 RECOMMENDATIONS

An experiment to determine the ability of LACIE analysts to 
accurately identify the interiors of designated blobs, as opposed to 
their ability to identify P-1 Dots, should be conducted. 

The use of a more sophisticated clustering algorithm which might 
produce better variance reduction with fewer clusters should be 
investigated. 




At least two aspects of Procedure B should be implemented in a
P-1 environment.

Multiple strata (no Type I dots)

Dots to be labelled assigned by stratified random sample

Further ahead in time, a blob capability could be added to P-1
to assure the purity of the dots assigned to be labelled.

1.6 PLANS 

During the final quarter we plan to carry out an end-to-end
evaluation of Procedure B in Kansas, similar to the Task 2 evaluation in
N. Dakota. This will be carried out jointly with Task 2. These tests
will be carried out for local and multisegment Procedure B and for 2
biowindows as well as 3 biowindows.
FIGURE 1.1. VARIANCE REDUCTION FACTOR FOR 5 SEGMENTS USING
PROCEDURE-1 CLUSTERING AND SEVERAL GROUPING RULES

FIGURE 1.3

FIGURE 1.4. COMPONENTS BEING EXAMINED

BLOB REDUCTION OF VARIANCE FACTOR

FIGURE 1.8. BLOB REDUCTION OF VARIANCE FACTOR
FOR 3 SETS OF SPATIAL WEIGHTS

FIGURE 1.9

FIGURE 1.12. REDUCTION OF VARIANCE FACTOR

VARIOUS ESTIMATES OF WHEAT PROPORTION
TASK 2

EVALUATION OF PARTITIONS FOR SIGNATURE EXTENSION

(R. C. Cicone)*


2.1 INTRODUCTION 

The sampling and classification strategy of the Large Area Crop 
Inventory Experiment (LACIE) entails employing local signature 
training to determine wheat proportion estimates within 5x6 mile 
sample segments. These estimates are then aggregated over areas 
of interest. Multisegment signature extension is philosophically 
founded on the premise that representative training information can be 
determined using non-local procedures at an additional savings in cost 
and potential reduction in the variance of the estimate. 

Task 2 is concerned with addressing the key issues found in 
Table 2.1 that pertain to non-local training techniques. 

2.2 OBJECTIVE 

The objective of Task 2 is to test and evaluate techniques and 
procedures which embody the signature extension approach to large 
area crop inventories using Landsat data. 

2.3 APPROACH

The approach adopted to address the objective of Task 2 is
outlined in Table 2.2.

2.4 SUMMARY OF PROGRESS DURING THIRD QUARTER 

Table 2.3 reviews progress and observations made during the First 
Two Quarters of this contract. Progress during the Third Quarter is 
outlined in Table 2.4. 

*T. Wessling and O. Mykolenko contributed to the work reported.

2.5 DETAILS OF PROGRESS

Efforts this quarter have concentrated on an evaluation of
Procedure B conducted in a local mode. The objectives of this
evaluation are detailed in Table 2.5. In addition to examining the
performance of Procedure B in a local mode to provide a basis for
comparison in multisegment testing, it was of great interest to
examine the performance of this procedure under conditions very
different from those under which it was originally developed and
tested. Overall results were accurate, and the procedure behaved, in
terms of bias and variance, in the manner predicted and within the
performance expected of Procedure 1.

Table 2.6 outlines the general characteristics of Procedure B,
whereas Table 2.7 provides information with regard to the procedure's
specific implementation. Procedure B differs from Procedure 1 in a
variety of ways. Important differences include:

1. Modular construction [Procedure B can be easily adapted 
to advances in processing techniques] 

2. data normalization 

3. use of spatial features 

4. use of multiple strata 

5. use of sampling directed randomly to the strata proportional 
to their size 

6. field interiors labelled rather than dots 

7. labelled field samples used for bias correction only
(i.e., none are used for signature definition)

The specific experiment conducted is outlined in Table 2.8 using
the data set illustrated in Figure 2.1. Note that the procedure is
evaluated in North Dakota, a state with different crops and smaller
fields than those encountered in Kansas where previous testing has
been conducted.

Figures 2.2 to 2.5 are provided to illustrate field specification
and labelling aspects of the procedure. Figure 2.2 is a map of field
shapes in segment 1899 as determined by the BLOB program. Figure 2.3
indicates sixty fields to be labelled. The procedure considers only
fields with at least one interior pixel. These interiors are
illustrated in Figure 2.5. Figure 2.4 indicates the subset of sixty
field interiors that would be presented to the AI for labelling.

Ground truth as provided by JSC in sub-pixel format was used for
labelling. Hence, bias and variance due to analyst labelling are not a
factor in this experiment. Two labelling criteria are evaluated
(Figure 2.6). Proportional labelling requires that the field be
labelled by the proportion of the dominant crop present, as opposed
to grain/non-grain labelling in which the field is labelled by the
dominant crop. The latter is considered more practicable for the AI.
Its success depends on how well the BLOB field definitions map the real
fields present.

The series of tables and figures that follow indicate various 
performance measures used to evaluate the procedure. Much attention 
was placed on evaluating how well field shapes determined by BLOB 
correspond to actual field* shapes. 

*A field is defined as a spatially contiguous area of like crops,
as opposed to an ownership boundary.

Table 2.9 provides statistics on a segment by segment basis
describing the characteristics of the field shapes (blobs) determined.
The definitions of 'big blobs', 'little blobs', etc. are consistent with
those in Task 1. The 'interior variance reduction factor' (R) is of
particular importance. This factor measures the relative purity of
blob interiors. Low values indicate purer fields. If R=0, each pixel
within the blob is from the same crop class; if R=1 then the field is a
50-50 mixture; and if R=.5 the field is about an 85-15 mixture. The
highest average value of .352 in segment 1903 has been found to be
suspect due to the misregistration of the ground truth. Figure 2.7
illustrates clearly that blob interiors are, in general, pure classes.
Over 60% of big blobs are 90% or better pure non-grain, over 30% are
90% or better pure grain, and less than 10% are somewhere in between.
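The correspondence quoted above between R and the mixture in a field can be checked with a short calculation. The sketch assumes, since the report does not give the formula here, that R for a single two-class field is p(1 - p) normalised so that a 50-50 mixture scores 1, which is what the three quoted values imply. Both function names are hypothetical.

```python
import math


def interior_r(p):
    """Purity score for a field with fraction p of one crop class.

    Assumed normalisation: p * (1 - p) divided by its maximum value 0.25,
    so a 50-50 mixture scores 1 and a pure field scores 0.
    """
    return p * (1 - p) / 0.25


def mixture_for_r(r):
    """Majority-class fraction (>= 0.5) that yields the given score r."""
    return 0.5 * (1 + math.sqrt(1 - r))


print(interior_r(1.0), interior_r(0.5), round(mixture_for_r(0.5), 3))
```

Under this assumption a score of .5 corresponds to a majority fraction a little above 85%, in line with the 85-15 figure in the text.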

Since Procedure B samples only non-empty fields (big blobs), a
source of bias with respect to a given crop in the procedure is the set
of small blobs not sampled. The bias is estimated by the expression:

$$\text{bias} = \frac{N - M}{N}\,(P_s - P_u)$$

where

$P_s$ is the crop proportion in the pixels from which samples were
drawn

$P_u$ is the crop proportion in the pixels from which no samples were
drawn

$N$ is the total number of pixels

$M$ is the total number of pixels from the sampled strata

If $P_s = P_u$ or if $M \approx N$, no significant bias is introduced. Figure
2.8 displays on a segment by segment basis (ordered by small grain
content) the relationship of $P_s$ and $P_u$. Note that in segments with
little grain, $P_u > P_s$; whereas as grain content increases, $P_s > P_u$. This
trend is being considered as a potential mechanism for bias correction
at the segment level.
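A small numeric check may help fix ideas about the bias from the unsampled small blobs. The sketch assumes the bias is the estimate minus the truth, where the procedure recovers the crop proportion of the sampled pixels and the truth is the pixel-weighted mix of sampled and unsampled proportions; that leads to (N - M)/N times the difference of the two proportions. The function name and all numbers are hypothetical.

```python
def unsampled_bias(p_sampled, p_unsampled, n_total, n_sampled):
    """Bias (estimate minus truth) from never sampling some of the pixels."""
    return (n_total - n_sampled) / n_total * (p_sampled - p_unsampled)


# Hypothetical segment: 800 of 1000 pixels lie in big blobs (sampled strata).
p_s, p_u, N, M = 0.30, 0.40, 1000, 800
bias = unsampled_bias(p_s, p_u, N, M)

# Cross-check against the direct definition: estimate minus true proportion.
true_proportion = (M * p_s + (N - M) * p_u) / N
direct = p_s - true_proportion
print(round(bias, 4), round(direct, 4))
```

The bias vanishes when the two proportions agree or when nearly every pixel belongs to a sampled stratum, matching the two conditions noted in the text.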

The next step in the procedure involves stratifying the 'big
blobs' into 'b-clusters.' This step is carried out to reduce the
variance of the overall estimate. The strata purity determines the
extent to which the overall variance can be reduced. Figure 2.9 uses
the variance reduction factor to illustrate the relative improvement
that can be expected for 20, 40 and 60 strata versus the unstratified
case. Note that b-cluster purity is limited by the purity of
the blobs that make up the b-clusters. There appears to be a general
trend to purer strata as the amount of grain in a scene increases. The
underlying cause is that the segments with more grain also have larger
fields, which tend to result in purer blob field definitions.

Turning to overall proportion estimates, Table 2.10 and Figure 2.10
are presented for reference. These represent the variance for
unstratified sampling and the bias on a segment basis that would be
expected characteristics of Procedure B. Using proportional labelling,
Figure 2.11 indicates the procedure performed very much along the lines
expected in terms of bias due to not sampling small fields (Figure 2.10).
Table 2.11 summarizes results for all eleven segments. Note that the
measured standard deviation for the unstratified case (1 b-cluster) is
comparable to that expected (Table 2.10). On the average the procedure was
biased by about 1% where 31% was the measured grain ground truth proportion.

The ratio

$$R = \frac{\sigma_{i,j}^2}{\sigma_{1,j}^2}$$

where

$\sigma$: standard deviation

$i$: the # of strata used

$j$: the # of fields sampled

is the measured variance reduction factor (R). In most cases $R \approx .5$,
indicating an advantage in stratifying. Both the accuracy and R factor
deteriorate if the number of samples is much smaller than the number of
strata. This is a result of: (1) missing certain strata in sampling and
(2) not being able to sample proportionally to the size of the strata.
Other computations not shown here indicated the procedure to be unbiased
with respect to the big blob ground truth (i.e., the truth ignoring the
unsampled small field strata).
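A measured variance reduction factor of this kind can be illustrated with a small simulation. The sketch below is not the experiment reported here: it assumes grain/non-grain labels drawn independently with replacement, proportional allocation of samples to strata, and two made-up strata. It repeats the sampling many times, once stratified and once pooled, and takes the ratio of the two variances of the resulting estimates.

```python
import random
import statistics


def measured_r(strata, n_samples, trials=4000, seed=0):
    """Ratio of stratified to unstratified estimate variance, by simulation.

    strata: list of (size, grain_fraction) pairs; values here are hypothetical.
    """
    rng = random.Random(seed)
    total = sum(size for size, _ in strata)
    p_all = sum(size * p for size, p in strata) / total
    stratified, pooled = [], []
    for _ in range(trials):
        estimate = 0.0
        for size, p in strata:
            # Proportional allocation, with at least one sample per stratum.
            k = max(1, round(n_samples * size / total))
            hits = sum(rng.random() < p for _ in range(k))
            estimate += (size / total) * hits / k
        stratified.append(estimate)
        # Unstratified case: every sample comes from one pool of all fields.
        hits = sum(rng.random() < p_all for _ in range(n_samples))
        pooled.append(hits / n_samples)
    return statistics.pvariance(stratified) / statistics.pvariance(pooled)


r = measured_r([(500, 0.9), (500, 0.1)], n_samples=20)
print(round(r, 2))
```

For these two fairly pure strata the theoretical ratio under the stated assumptions is 0.36, and the simulated value should scatter around that figure; purer strata push it toward zero.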

It was of great interest to compare the result of the second 
labelling criterion (grain/non-grain) to the proportional labelling 
approach since this labelling technique seemed more achievable using 
analyst interpreters. Figure 2.12 illustrates the bias measured on
a per segment basis, which compares favorably with both the expected bias
(Figure 2.10) and that measured using proportional labelling (Figure 2.11).
This is largely due to the relatively pure character of the blob
interiors sampled. Table 2.12 is the counterpart to Table 2.11. We
find the procedure to be largely unbiased and behaving, in terms of 
variance, as expected. This approach also proved to be unbiased with 
respect to the grain estimate based on big blobs only. 

2.6 CONCLUSIONS AND RECOMMENDATIONS 

Table 2.13 contains conclusions drawn based on the analysis of 
local Procedure B performance in North Dakota. Recommendations are 
shared with and presented in Task 1. 

2.7 PLANS 

Table 2.14 lists recommended activities for future work. 
Activities for the remainder of the contract year include the completion 
of evaluation of multisegment techniques. 
SEGMENT 1899 FIELD INTERIOR MAP 

TABLE 2.8 

EVALUATION OF LOCAL PROCEDURE B 

TABLE 2.9 

NORTH DAKOTA FIELD STATISTICS 

PERFORMANCE OF LOCAL PROCEDURE B 

(Proportional Labelling Of Fields) 

Difference Determined By ESTIMATE-TRUE Where TRUE Is The Percent Of All 
Small Grains In LEG Ground Truth 

TABLE 2.12. PERFORMANCE OF LOCAL PROCEDURE B 

(Grain/Non Grain Labelling Of Fields) 
TASK 4 

SPECTRAL SEPARABILITY OF SPRING WHEAT FROM OTHER SMALL GRAINS 
(W.A. MALILA and E.P. CRIST) 

4.1 INTRODUCTION 

The problem of distinguishing between spring wheat and other small 
grains is of interest to LACIE and similar large-area crop inventory 
activities. Prior studies at ERIM (See Table 4.1) gave indications of 
separability of spring wheat and barley under certain conditions in 
Landsat data from several Phase 2 Blind Sites in North Dakota. Last 
quarter, we expanded our analysis to include Phase 3 Blind Sites, with 
comparable results — unitemporal correct classification given complete 
training in the 80% range on individual segments and 76% for a seven- 
segment case. Spring wheat and barley were most separable in the growth 
stage when they are turning color and ripening. Thus, accurate crop 
calendar estimates appear to be important for discrimination. Work 
presented at the last quarterly S.R.&T. review by Dr. Gautam Badhwar of 
NASA/JSC also indicated the importance of crop calendar estimates for 
individual fields within a segment. 

The objective of this task is to develop a spectral classification 
procedure for discriminating among the spring small grains using Landsat 
data. Oats were found to be much more difficult than barley to spectrally 
distinguish from spring wheat, so major emphasis has been placed on 
spring wheat vs. barley separation. 

4.2 SUMMARY OF PROGRESS DURING THE THIRD QUARTER 

During the reporting period, a general procedure for discrimination 
was outlined; specific steps were selected in some cases, and several 
alternatives were explored in others. We explored the use of the time 
profile of green development as a basis for estimating the shifts in 
crop calendar between individual fields or pixels. A mathematical 
model form was selected to serve as a reference profile in the estimation 
procedure, together with a cross-correlation calculation to determine 
time shifts. Three alternative decision procedures for using shift 
calculations were defined and evaluation was begun using data from 
Segment 1663, primarily, and from two other segments. 

4.3 DETAILED DISCUSSION OF PROGRESS 

A general procedure for discrimination between spring wheat and 
barley is outlined in Table 4.2. The preprocessing steps of data 
screening and haze correction would incorporate the XSTAR procedures 
developed by P. Lambeck of ERIM (See Task 8 for a description of the 
most recent development, a spatially varying haze correction algorithm). 

Several potential measures of crop development are listed in Table 
4.3. We decided to model the green development profile using selected 
data from Segment 1663 (See Table 4.4) and the mathematical model form 
shown in Figure 4.1. After manual alignment of several sets of obser- 
vations, regression fits of model parameters were made to Tasseled-Cap 
Greenness values (offset by 25 units), resulting in the profiles shown 
in Figure 4.2 for spring wheat and barley. Note that barley tended to 
be greener at the peak and to decline faster in Greenness late in the 
cycle. 

To estimate crop calendar shifts of individual fields or pixels, a 
cross-correlation calculation was made between offset Greenness values 
and samples from the reference profile for various shifts. The form 
chosen for the cross-correlation function (See Table 4.5) has values 
between 0 and 1 and is independent of any scale factor that may exist 
between the reference and the observations. In addition, one may 
estimate other characteristics of the observed data, such as peak 
Greenness value. 
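
The shift estimate described above can be sketched as follows. This is an illustration only: the bell-shaped reference profile, the acquisition days other than Days 157 and 193, and the squared-cosine form of the correlation are assumptions. The actual model form and correlation function are those of Figure 4.1 and Table 4.5, which are not reproduced here. The squared-cosine form was picked because it has the two properties stated in the text: its values lie between 0 and 1, and it is unaffected by a scale factor between reference and observations.

```python
import math

def reference_profile(day, peak_day=180.0, width=25.0, peak=30.0):
    """Hypothetical bell-shaped Greenness profile (not the Figure 4.1 form)."""
    return peak * math.exp(-((day - peak_day) / width) ** 2)

def normalized_correlation(obs, ref):
    """Squared cosine similarity: lies in [0, 1], unchanged if obs is rescaled."""
    num = sum(o * r for o, r in zip(obs, ref)) ** 2
    den = sum(o * o for o in obs) * sum(r * r for r in ref)
    return num / den if den > 0 else 0.0

def estimate_shift(days, obs, shifts=range(-20, 21)):
    """Return the shift (in days) whose shifted reference best matches obs."""
    return max(shifts, key=lambda s: normalized_correlation(
        obs, [reference_profile(d - s) for d in days]))

# Hypothetical acquisition days and a pixel that runs 6 days late at 80% amplitude.
days = [139, 157, 175, 193, 211]
observed = [0.8 * reference_profile(d - 6) for d in days]
shift = estimate_shift(days, observed)
```

Because the correlation ignores amplitude, a quantity such as peak Greenness has to be estimated separately once the shift is known, for example by a least-squares scale factor between the shifted reference and the observations.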

Figures 4.3 and 4.4 illustrate the effects of crop calendar shifts 
on spring wheat and barley pixels, respectively. A summary of our 
analysis of these results is presented in Table 4.6. For the remaining 
work reported hereafter, only shifts relative to the spring wheat profile 
were utilized. 

Three alternatives for the use of crop calendar shift calculations 
in small grains discrimination are presented in Table 4.7. We began 
evaluation of these alternatives, with initial emphasis being placed 
on the first two alternatives. The data set employed is described in 
Table 4.8. 

Since the procedure is presumed to operate on pixels that have been 
classified as small grains, we devised a simple test to check for 
acceptable small-grain trajectories (Figure 4.5) in the data set. This 
check tended to eliminate pixels that had apparent boundary effects. 

Applying a discrete version of the procedure, data were divided 
into bins according to their computed crop calendar shifts from the 
spring wheat reference date (Table 4.9). Then, a series of stepwise 
linear discriminant analyses was conducted to determine which variables 
and acquisition dates were most valuable in separating spring wheat from 
barley. As noted in Table 4.10, distance from the green arm, a linear 
combination of Greenness and Brightness (and well correlated with Landsat 
MSS 5), was most frequently preferred, with Greenness and Brightness at 
Acquisitions 3 and 5 (Days 157 and 193) also being frequent high choices 
for additional discrimination variables. 

The discrete procedure was applied to data using Brightness and 
Greenness variables at the two key times and optimized decision rules 
for each category of shift. Classification results are presented in 
Table 4.11. Little difference is seen between results obtained with 
the shift data and without it, except for Segment 1663 which was used to 
determine the reference profile. Further analysis was conducted to determine 
which shift values were associated with the best and worst classification 
performances. An even finer division of shift values was made and 
classifications performed. The results in Table 4.12 show that highest 
performance was obtained for those pixels with the least amount of shift 
and tended to decline as the amount of shift increased. This, we believe, 
is due to the fact that the Landsat observations were timed optimally 
relative to the average crop calendar for the segment, and large shifts 
corresponded to suboptimal sampling times (See Figure 4.6). 
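
The bookkeeping behind a by-shift breakdown of this kind can be sketched as below. The bin width, the function names and the sample records are hypothetical. The actual shift classes are those of Tables 4.9 and 4.12.

```python
import math
from collections import defaultdict

def shift_bin(shift_days, width=6):
    """Index of the bin containing shift_days; bin 0 is centered on zero shift."""
    return math.floor(shift_days / width + 0.5)

def accuracy_by_bin(records, width=6):
    """records: iterable of (shift_days, true_label, predicted_label)."""
    hits, totals = defaultdict(int), defaultdict(int)
    for shift, truth, predicted in records:
        b = shift_bin(shift, width)
        totals[b] += 1
        hits[b] += (truth == predicted)
    # Fraction correctly classified within each shift bin.
    return {b: hits[b] / totals[b] for b in sorted(totals)}

# Hypothetical pixels: two near zero shift, two with a large positive shift.
records = [(0, "wheat", "wheat"), (1, "barley", "barley"),
           (10, "wheat", "barley"), (11, "barley", "barley")]
per_bin = accuracy_by_bin(records)
```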

To explore the possible advantages of using shift as a continuous 
variable, single-date classification was performed twice for Segment 1663: 
once using only the four channels for Day 193 and once adding computed 
crop calendar shift as a fifth variable in the decision rule. Results 
are presented in Table 4.13 for three different aggregations of the 
decisions. On the left, all small grain pixels that were single-class 
according to the JSC ground truth tape are included. In the center, 
those passing the small-grain trajectory test are included. The right 
column, which shows much improved performance, includes only pixels which 
passed a field-center test based on spatial-spectral "blobbing" of the 
data. In all cases, the addition of shift improved performance, with 
the greatest improvement for the edited pixels. 

Additionally, we initiated an analysis of techniques being developed 
and applied by G. Badhwar of JSC. We determined that more detailed 
information about them was required before analyses could proceed. 

4.4 CONCLUSIONS AND RECOMMENDATIONS 

Conclusions and recommendations are presented in Table 4.14. 

4.5 PLANS 

Plans for the next quarter are presented in Table 4.15. 




TASK 4 

POTENTIAL MEASURES OF CROP DEVELOPMENT 

TASK 4 

MODELING GREEN DEVELOPMENT PROFILE 
ALTERNATIVE DECISION PROCEDURES USING RESULTS OF SHIFT CALCULATIONS 

Values in Parentheses are without Trajectory Test. 

DISCRETE PROCEDURE (Continued) 

DISCRETE PROCEDURE - RESULTS 

DISCRETE PROCEDURE - RESULTS 
BY SHIFT CLASS 
ALL 3 SEGMENTS 

CLASSIFICATION COMPARISONS - SEGMENT 1663 ALL SPRING WHEAT 
AND BARLEY PIXELS MID-JULY DATA 

FIGURE 4.1 

SELECTED MODEL FORM 

EFFECTS OF CROP CALENDAR SHIFTS ON 1663 GREENNESS PROFILES FOR SPRING WHEAT 

EFFECTS OF CROP CALENDAR SHIFTS ON 1663 GREENNESS PROFILES FOR BARLEY 

TASK 6 

FORECASTING PRODUCTION OF WHEAT FROM SATELLITE DATA 

(J. Colwell) 

6.1 OBJECTIVE 

The general objective of this task is to investigate the utility of 
Landsat, meteorological, and ancillary data for forecasting winter wheat 
yield. 

6.2 APPROACH 

During this report period we have concentrated on an analysis designed 
to make recommendations concerning future directions for crop yield fore- 
casting. Initial activity was devoted to discussions between ERIM and JSC 
with respect to the focus and limits of this activity. Subsequent activity 
has involved review of pertinent literature and discussions with persons 
knowledgeable in the area of wheat yield forecasting. 

6.3 TECHNICAL DISCUSSION 

The analysis is proceeding, and documentation of that analysis has 
begun. 

6.4 FUTURE PLANS 

It is expected that a written report summarizing recommendations 
concerning future directions for crop yield forecasting will be completed 
during the next quarter. 
TASK 7 

STUDY OF MULTICROP SPECTRAL SEPARABILITY 
(W. A. Malila, Task Leader)* 


7.1 INTRODUCTION 

This task addresses the extension of large area crop inventory 
technology to other important crops in addition to wheat. The objective 
is to conduct signature studies using currently available data to provide 
insights and identify potential problem areas for investigation when 
LACIE Transition-Year data sets become available. 

7.2 APPROACH 

The approach being followed is outlined in Table 7.1. Major emphasis 
is to be placed on corn and soybeans data available in Landsat data sets 
and from other sources. 

7.3 SUMMARY OF PROGRESS 

During the reporting period, we continued our analysis of Landsat 
data acquired during the CITARS project. In re-examining these data in 
light of understandings and new techniques resulting from LACIE, we carried 
out XSTAR preprocessing and Tasseled-Cap transformations of data from the 
Fayette and Livingston data sets. Landsat signatures of corn and soybeans 
were analyzed as a function of time, and clustering and classification 
studies continued. A simple discrimination test between corn and soybeans 
was devised and yielded high (90-95%) correct classification results on 
two CITARS data sets. 

Limited amounts of corn and soybeans data exist in selected LACIE 
Phase 3 Blind Sites on the fringes of the U.S. Corn Belt region. Analysis 
of these data was initiated. 

* J. Heradal, J. More, and E. Crist contributed to the work reported 
herein. 

Prior field measurements of corn and soybeans spectral reflectance 
were received from Purdue/LARS, but we were unable to begin analysis of 
them during the quarter. 

7.4 DETAILED DISCUSSION OF PROGRESS 

Last quarter, results of spectral clustering of field-center pixels 
of corn and soybeans were described. Multitemporal displays of two 
clusters from each crop are presented in Figure 7.1 (MSS 6 vs. MSS 5). 
The two corn clusters represent 93% of all corn pixels analyzed. Soil 
color differences for the two clusters are evident on the first 
acquisition date, June 10th; however, the next three dates reveal the fact 
that the vegetation obscures and/or shadows most of the soil and 
consequently dominates the signature. Similar effects are present for the 
two soybeans clusters shown. As noted last quarter, soybeans clusters were 
more numerous and varied than corn clusters; these two largest account 
for only 37% of the total pixels. A final observation made from this 
figure is that the corn clusters drop in MSS 6 values from 17 July to 
21 August, while soybean values continue to increase; similar trends were 
present in Tasseled-Cap Greenness values. The decrease for corn 
apparently is linked to tasseling. 

Based on observations of distinctive features in the time profiles 
of Greenness for corn and soybeans (see Table 7.2), a simple test was 
formulated to distinguish between these crops. As defined in Table 7.3, 
a ratio of Greenness on two dates is computed and compared to a threshold 
value. The threshold level for ground-visited* fields in the Fayette 
data was established by histogramming the ratio for the collection of 
field means, noting a bimodality in the data, and selecting a threshold 
value (1.2) which separated the modes (see Figure 7.2). Results are 
presented in Tables 7.3 and 7.4: an average conditional correct 
classification of 93-95%. Comparable results were obtained for Livingston 
field means (Table 7.5), and the same threshold value applied (Figure 7.2). 

* CITARS ground "truth" had been obtained in two ways: by ground 
visitation and by photointerpretation of multidate aerial color 
photography. 
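
A ratio test of this kind might look like the sketch below. The two dates, the direction of the ratio (later date over earlier date) and the function names are assumptions. The actual definition is the one given in Table 7.3. Only the threshold value of 1.2 comes from the text. The direction was picked to match the observation that corn Greenness falls late in the season while soybean Greenness keeps rising.

```python
def greenness_ratio(g_first, g_second):
    """Ratio of Greenness on the later date to Greenness on the earlier date."""
    return g_second / g_first

def label_field(g_first, g_second, threshold=1.2):
    """Call a field soybeans if its Greenness rose by more than the threshold ratio."""
    return "soybeans" if greenness_ratio(g_first, g_second) > threshold else "corn"

# Hypothetical field-mean Greenness values on the two dates.
corn_like = label_field(30.0, 24.0)      # Greenness fell between dates
soybean_like = label_field(20.0, 30.0)   # Greenness kept rising
```

Applied to pixels instead of field means, the same rule would be expected to see noisier ratios, which is why the bimodality of the pixel histogram is checked separately in the text.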

Since field means are less noisy than individual pixel values, we 
decided to apply the test to Fayette pixel data as well. The ratio 
histogram in Figure 7.3 exhibits the same type of bimodality that was 
observed for field means. Results of the test on pixels (Tables 7.6 
and 7.7) again show a high level of performance — 93% combined correct 
classification. 

Our observations and conclusions are summarized in Table 7.8. The 
test developed is not intended to be a final procedure. Rather, it 
demonstrates that spectral differences are present and consistent in 
at least this limited data set and should serve as a basis for discrimi- 
nation procedures in the future. 

7.5 PLANS 

Plans for the next quarter are presented in Table 7.9. 
FIGURE 7.1 

MULTITEMPORAL TRAJECTORIES OF MAJOR CORN AND SOYBEAN CLUSTERS 
FAYETTE FIELD-CENTER PIXELS CITARS DATA 



Notes: 93% of corn pixels were in the two clusters shown. 

Soybean clusters were more diverse; the above two represent 37% of total. 





FIGURE 7.3 

HISTOGRAM OF R RATIO SHOWING BIMODALITY 



Fayette Pixels 
DISTINCTIVENESS OF TIME PROFILES OF GREENNESS FOR CORN AND SOYBEANS 

Therefore, a Simple Test was Formulated to Distinguish Between Corn 
and Soybeans (Based on Ratio of Greenness Values on Two Dates) 

SIMPLE TEST FOR CORN VS. SOYBEANS 

TABLE 7.4 

CLASSIFICATION RESULTS FOR LIVINGSTON DATA 

COMBINED RESULTS ON FAYETTE PIXELS (GROUND-VISITED AND PI) 

BREAKDOWN OF RESULTS FOR PIXELS-FAYETTE 
GROUND-VISITED PIXEL SET 

Distinctive Characteristics of Multitemporal Signatures of 

Continue Analysis of Signature Characteristics, Trajectory 
TASK 8 

MULTICROP LABELING AIDS 
(R. C. Cicone and R. Balon) 


8.1 INTRODUCTION 

The accuracy of operational large area crop inventory systems 
modeled after LACIE depends critically on the correctness of crop labels 
generated by Analyst Interpreters. Task 8 is being conducted as a partial 
response to a request from the multicrop inventory planning committee 
for support from the SR&T community in adapting LACIE technology to a 
multicrop environment. The critical issue addressed is that of analyst 
labeling. 

8.2 OBJECTIVES 

Our purpose through Task 8 is to analyze the methods of presenta- 
tion of Landsat data for purposes of human interpretation to assess how 
well they convey the information relevant to crop discrimination. We 
endeavor to develop new data presentation techniques in the form of false 
color image products as well as graphic displays of spectral information, 
which stand to expedite correct labeling of crops. 

8.3 APPROACH 

Points of approach for this task are presented in Table 8.1. We 
have been progressing along a path with the following three major stages: 

1. Implement a model of the color production characteristics of 
the system used to produce the film products, taking account 
of the physical considerations of film production and the 
psychological/perceptual considerations of the fact that the 
film is interpreted through human visual processes. 

2. Analytic evaluation of LACIE film products with regard to 
information content and between-scene stability of color meaning. 

3. Development of a framework for new image products which take 
advantage of the knowledge which has been gained about color 
production. 

In another branch of approach we have been working toward the preparation 
of color keys for corn and soybean crop cycles in order to assist the 
Analyst Interpreters in the move to multicrop inventory. 

8.4 SUMMARY OF PROGRESS IN PREVIOUS QUARTERS 

Table 8.2 describes our progress to date. A model of the color 
production of the Production Film Converter after Juday of NASA/JSC has 
been implemented. The model features representation of colors in the 
coordinates of a space designed to be uniform with respect to human 
perception of color differences. Figure 8.1 shows the volume of color 
space which the PFC can represent. Table 8.3 describes this space and 
Table 8.4 diagrams the mapping from Landsat channel signals to color space. 

Analysis of LACIE image products with regard to sensitivity and 
information content has been completed. It has been found that non-linear 
characteristics of the PFC and of color vision yield, under existing 
color mappings, a distorted view of Landsat data to the eye. That is to 
say, the amount of color expansion afforded to the data is not uniform 
throughout the range of the data. The following experiment serves to 
illustrate this point. A regular sample of points is taken from the plane 
of data concentration for an acquisition (the Kauth Brightness/Greenness 
Plane) and mapped to color space coordinates. A spherical distribution 
is constructed about each of these color space points. Figure 8.2 shows 
this schematically. The distribution is then transformed by the inverse 
mapping from color to data space. A display of these distributions in 
data space shows directly how the data is broken up into classes by color 
in the image. Figure 8.3 presents a particular example of this display 
technique. Each ellipse boundary represents one distinguishable color 
or class on an image. Where the ellipses overlap, the data is not being 
"resolved" to the fineness of the sample grid. The spacing of the 
sample grid is based on a liberal estimate of 3 counts of noise in Landsat 
data. Where ellipses in the sensitivity plot (Figure 8.3) do not overlap, 
the image is sensitive to less than a 3 count difference between data 
points. Figure 8.4 overlays Figure 8.3 to show the distribution of data to 
which the sensitivity ellipse plot applies. Note that in the area of 
data concentration there is undesirable overlapping of ellipses (image 
classes). From examining acquisitions of a number of sample segments in 
this manner we conclude that resolution of the data into color classes 
is generally less than desirable whenever the data fills out the full 
range of an agricultural scene. Elongated ellipses indicate that distance 
relationships in the data are not preserved by the image, i.e., the image 
presents a distorted view of the data to the eye. The pattern of 
distortion as it appears in Figure 8.3 is similar from one acquisition to 
another. Further explanation of the sensitivity study and the color 
production model will be found in an interim technical memorandum to be 
released next quarter. 
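
The construction of a sensitivity ellipse can be approximated locally, as in the sketch below. It is a simplification under stated assumptions: the mapping `to_color` is a made-up nonlinear stand-in and not the PFC color production model, and the inverse image of the color-space sphere is approximated by linearizing the mapping at each grid point. The 5-unit radius and the 3-count grid spacing echo values quoted elsewhere in this task discussion.

```python
import math

def to_color(brightness, greenness):
    """Hypothetical nonlinear data-to-color mapping (a stand-in, not the PFC model)."""
    return (100.0 * math.sqrt(brightness / 128.0),
            60.0 * math.tanh(greenness / 20.0),
            0.2 * brightness)

def jacobian_columns(mapping, b, g, h=1e-3):
    """Central-difference derivatives of the color coordinates w.r.t. b and g."""
    db = [(p - m) / (2 * h) for p, m in zip(mapping(b + h, g), mapping(b - h, g))]
    dg = [(p - m) / (2 * h) for p, m in zip(mapping(b, g + h), mapping(b, g - h))]
    return db, dg

def sensitivity_ellipse(mapping, b, g, radius=5.0):
    """Semi-axes (major, minor), in data counts, of the data-space region
    whose colors lie within `radius` color units of the color at (b, g)."""
    db, dg = jacobian_columns(mapping, b, g)
    # Entries of the symmetric 2x2 matrix J^T J.
    m11 = sum(x * x for x in db)
    m22 = sum(x * x for x in dg)
    m12 = sum(x * y for x, y in zip(db, dg))
    centre = (m11 + m22) / 2.0
    spread = math.hypot((m11 - m22) / 2.0, m12)
    lam_max, lam_min = centre + spread, centre - spread
    return radius / math.sqrt(lam_min), radius / math.sqrt(lam_max)

# Ellipses on a regular grid with 3-count spacing (hypothetical data range).
grid = [(b, g) for b in range(40, 80, 3) for g in range(5, 35, 3)]
ellipses = {point: sensitivity_ellipse(to_color, *point) for point in grid}
```

Semi-axes that are large compared with the grid spacing correspond to overlapping ellipses, and a major axis much longer than the minor axis corresponds to the elongation described above.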

An analysis of an image in terms of classification accuracy was 
performed to test retention of information from the original 4 channels 
of Landsat data to the false color image. Table 8.5 shows classification 
accuracy of wheat vs. non-wheat at steps along the data-to-perception 
conversion process. Significant loss of information occurs in dropping 
a channel of information and because of the less than desirable level 
of data to color expansion (resolution). Overall wheat recognition drops 
from 97.2% to 81.8% in the complete process. These experimental results 
support the conclusion drawn from sensitivity analysis, namely that 
currently used image products do not display the full information content 
of the data. 

Work last quarter centered on assessment of image color consistency 
between scenes for LACIE image products and on establishing a framework 
for definition of Uniform Color Space image products. These items are 
discussed in the next section. 

8.5 DESCRIPTION OF PROGRESS IN THIRD QUARTER 

LACIE film products are generated from Landsat channels which are 
preprocessed (scaled and biased) in order to improve the contrast present 
in the image produced. The parameters of the transformation for each 
channel are calculated anew for each Landsat scene based on the mean 
and standard deviation of data points for each channel. This initial, 
inconsistent manipulation of the data has inherent dangers. The 
upshot is that a point of data space will not be mapped to one color 
consistently. Instead, the color of a data point varies depending on 
the statistics of the scene. This creates problems in interpretation 
of imagery. Color inconsistency has in some instances led to unacceptable 
discrepancies between interpretation and ground truth. Alternative 
film products have been proposed with the idea of reducing color 
inconsistency between scenes. One of these alternatives, the "Kraus 
Product", has been adopted as a supplement to the original product, 
"Product One". However, the alternative product is essentially very 
similar to Product One in that it also applies a preprocessing to the 
data which depends on scene statistics. An experiment was performed to 
assess the between-scene color consistency of Kraus Product imagery. The 
Kraus Product calculates a preprocessing scale factor for each channel 
based on a parameter μ which measures the average brightness of the scene 
across three channels. To evaluate the consistency of this product, we 
allow the scene mean to vary throughout the normal range of agricultural 
data, which is represented by the Tasseled Cap of the Kauth 
Brightness/Greenness plane. For each of these hypothetical scene means we 
calculate the color of a fixed reference point (Figure 8.5). This 
simulates the appearance of the reference point in a wide range of scenes. 
The color will change and the degree of change is expressed by the 
distance in Uniform Color Space units between the colors. Five UCS units 
is a just noticeable difference, fifteen units is an easily 
distinguishable difference and fifty units is a highly contrasting 
difference between colors. As our reference color we use the color of our 
reference data point when the scene mean coincides with it. Figure 8.6 
portrays the color difference strata which result from this experiment. 
As one would expect, the stratification occurs primarily in the 
Brightness direction. This is because the Kraus Product uses an average 
scene brightness as the sole parameter in preprocessing the data. The 
important point to be taken from this figure is that large color 
inconsistency occurs in Kraus imagery when average Brightness is 
different from one scene to another. 

For example, as the scene mean moves away from the reference point by 
dropping a few counts in Brightness (moving to the left in Figure 8.6) the 
color of the reference point changes by 12 UCS units, a visually discern- 
ible difference. If the scene mean is moved downward in Brightness 
about 15 counts the resulting color change in the reference point comes 
to 53 UCS units. The original color and the new color may be characterized 
as highly contrasting. The color on a Kraus Product image is insensitive 
to the scene average Greenness coordinate for a constant Brightness. 
However, the color representation of a particular data point varies 
dramatically with the scene average Brightness. The color attributes 
examined included hue, saturation and lightness. A similar pattern, as 
displayed in Figure 8.6, was observed in restricting the analysis to 
chromatic attributes only, i.e., hue and saturation. Considering the 
fact that external effects such as light haze level and sun angle are 
responsible for significant variation in average scene Brightness, one 
may commonly find inconsistent color representation in the Kraus 
Product. 
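
The mechanism can be made concrete with a small sketch. The scale factor formula, the target value and the pixel values are hypothetical, since the actual Kraus Product preprocessing is not specified here. Only the 5, 15 and 50 unit thresholds are taken from the discussion above. The first part shows that a brightness-driven scale factor sends the same data point to different film values in two scenes. The second part grades a color difference, given in UCS units, by those thresholds.

```python
def brightness_scale(scene_mean_brightness, target=64.0):
    """Hypothetical scale factor driven only by average scene brightness."""
    return target / scene_mean_brightness

def film_values(pixel, scene_mean_brightness):
    """Scaled channel values for one data point under a given scene mean."""
    s = brightness_scale(scene_mean_brightness)
    return tuple(s * channel for channel in pixel)

def describe_difference(ucs_units):
    """Grade a color difference using the thresholds quoted in the text."""
    if ucs_units >= 50:
        return "highly contrasting"
    if ucs_units >= 15:
        return "easily distinguishable"
    if ucs_units >= 5:
        return "just noticeable"
    return "below just noticeable"

# The same hypothetical data point rendered in a bright and in a dark scene.
pixel = (40.0, 30.0, 50.0)
bright_scene = film_values(pixel, scene_mean_brightness=60.0)
dark_scene = film_values(pixel, scene_mean_brightness=45.0)

# The two color changes reported above, graded by the thresholds.
small_drop = describe_difference(12)   # few counts lower in Brightness
large_drop = describe_difference(53)   # about 15 counts lower in Brightness
```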

During this quarter, preliminary work has been done on design of 
new image products. The justification for this work is summarized in 
Table 8.6. The new products take advantage of the known structure of 
Landsat data and the known structure of the PFC's color space to improve 
upon the shortcomings of information distortion and color inconsistency 
inherent in LACIE film products. The XSTAR spatially varying haze 
correction algorithm will be used to standardize representation of Landsat 
data in Brightness, Greenness and Yellowness coordinates. The standardized 
Brightness/Greenness plane can then be mapped by linear transformation 
onto a chosen plane of Uniform color space. Choice of the color space 
plane to use will be arrived at by experimentation and interaction with 
Analyst-Interpreters. Figure 8.7 schematizes one possibility. 

8.6 PLANS 

Our plan for this quarter is to experiment with Uniform Color Space 
image product design and produce examples of imagery for inspection. 

We also plan to produce color keys for currently used products to aid 
Analyst-Interpreters in the move to multicrop inventory. Reports to be 
released include: 

1. Uniform Color Space Analysis of LACIE Image Products 

2. AI Color Key for Multicrop Inventory 

3. Final Report 

JUSTIFICATION FOR UNIFORM COLOR MAPPING 

BOUNDARY OF COLORS ATTAINABLE BY THE PRODUCTION 
FILM CONVERTER IN L*a*b* UNIFORM COLOR SPACE 

SEGMENT 1154, BIO 2 
PRODUCT ONE, CH 4 DOUBLED 

FIGURE 8.3 
TYPICAL SENSITIVITY ELLIPSE PLOT 

FIGURE 8.4 
DATA DISTRIBUTION CORRESPONDING TO THE SENSITIVITY ELLIPSE PLOT 

FIGURE 8.6 
COLOR CONSISTENCY IN KRAUS PRODUCT 
(16 Level Quantization Of PFC) 

TASK 8b 

MULTICROP LABELING AIDS 
(P. F. Lambeck) 


8b.1 INTRODUCTION 

The analysts' ability to generate crop labels, the machine's ability 
to classify data correctly, and the researchers' ability to devise 
improved data analysis procedures all depend on the quality and consistency 
of the data which is analyzed and processed. During the second quarter 
we identified localized atmospheric variations within the boundaries of 
LACIE segments as a serious detriment to continued progress in developing 
labeling aids for analysts and in developing other new or improved analysis 
techniques. Since these localized atmospheric variations were not 
addressed by the XSTAR algorithm as implemented at that time, and since a 
viable means to remedy this shortcoming had already been proposed, a 
portion of this quarter's effort was focussed on this crucial problem. 

8b.2 OBJECTIVE 

The objective of this effort was to develop a spatially varying 
XSTAR haze correction procedure to compensate Landsat data for localized 
atmospheric distortions so that more uniform observation conditions 
could be simulated. The key issues addressed by this effort are listed 
in Table 8b.1. 

8b.3 APPROACH 

The approach that was taken to developing the spatially varying 
XSTAR haze correction is outlined in Table 8b.2. The first two items 
listed in the table are the obvious ones; however, the last two items also 
had to be considered in order to arrive at a procedure which was 
sufficiently general and practical. The manner in which the spatially 
varying XSTAR correction was made to fit these latter two constraints is 
discussed in section 8b.4. 

8b.4 PROCEDURE 

The previous "global" or "spatially invariant" XSTAR procedure has 
been defined in Reference 8b.1. The spatially varying procedure which 
was developed during this quarter is outlined in Table 8b.3. 

The first step of the spatially varying XSTAR correction is to 
recalibrate and screen the data as in Reference 8b.1 but with two of 
the SCREEN thresholds made less stringent by shifting them in the 
negative Tasselled Cap yellow direction. Specifically, the second 
equation under step 3b and the second equation under step 3c are 
modified as follows: 


$$z_3 + z_1/10 < z_{Cmin} + z_{Cbias}, \qquad (z_{Cmin} = -7.5) \tag{8b.1}$$

$$z_3 + z_1/7 < z_{Hmin} + z_{Cbias}, \qquad (z_{Hmin} = -3.25) \tag{8b.2}$$

(For the examples shown later in this report we have used 
$z_{Cbias} = -1.$[illegible]; however, subsequent testing indicates that 
$z_{Cbias} = -1.7$ is perhaps the optimum value to use.) 
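
The two relaxed inequalities can be sketched in code as follows. This is an illustration only: it assumes that z1 and z3 are the Tasselled Cap brightness and yellow coordinates, that a pixel satisfying an inequality is the one flagged (the "C" threshold for cloud, the "H" threshold for dense haze), and that the bias is the value of -1.7 suggested above. The function name is hypothetical, and the remaining SCREEN tests of Reference 8b.1 are not reproduced.

```python
import numpy as np

Z_CMIN = -7.5    # threshold quoted with Eq. (8b.1)
Z_HMIN = -3.25   # threshold quoted with Eq. (8b.2)


def relaxed_screen_flags(z1, z3, z_cbias=-1.7):
    """Evaluate only the two modified SCREEN inequalities.

    z1, z3  : arrays of brightness and yellow values (assumed meaning).
    z_cbias : shift of both thresholds in the negative yellow direction.
    Returns two boolean arrays: (cloud-like, dense-haze-like).
    """
    z1 = np.asarray(z1, dtype=float)
    z3 = np.asarray(z3, dtype=float)
    cloud = z3 + z1 / 10.0 < Z_CMIN + z_cbias
    dense_haze = z3 + z1 / 7.0 < Z_HMIN + z_cbias
    return cloud, dense_haze


# A pixel just inside the unbiased cloud threshold is no longer flagged
# once the threshold is shifted by the bias.
strict, _ = relaxed_screen_flags([50.0], [-13.0], z_cbias=0.0)
relaxed, _ = relaxed_screen_flags([50.0], [-13.0])
```

With a zero bias the sample pixel gives -13 + 5 = -8, which is below -7.5; with the bias of -1.7 the threshold becomes -9.2 and the pixel is retained.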

For the global XSTAR correction, the SCREEN thresholds $z_{Cmin}$ and 
$z_{Hmin}$ were used to edit out dense haze or cloud areas which were atypical 
of normal scene conditions and which were thus more likely to represent 
localized atmospheric variations. However, some of this variation which 
was screened out was potentially correctable by a spatially varying 
XSTAR procedure. Hence, for the new procedure these thresholds have 
been relaxed somewhat. (The thresholds separate dense haze and clouds 
from bright fields.) Relaxing these thresholds also allows the 
spatially varying XSTAR correction to see more of the spatial 
variations and to track them more accurately. 

The second step of the spatially varying XSTAR procedure is to 
divide the scene into 5 line by 5 pixel blocks, and to calculate a 
floating point mean value for each block, using only signal values 
from pixels within the blocks which pass the SCREEN procedure (i.e., 
"good" pixels). Mean values for blocks with no good pixels or with 
fewer good pixels than half the average number of good pixels per block 
(truncated to integer form) are not used. For these "unknown" blocks, 
mean values are estimated by interpolating or extrapolating from 
neighboring block mean values, as described in steps 3 and 4, below. 
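
A minimal sketch of this block averaging step is given below, for a single channel. It assumes that the SCREEN result is available as a boolean mask of good pixels and that the truncation to integer form applies to half the average good-pixel count. The function name and the use of NaN to mark "unknown" blocks are illustrative choices, not part of the procedure as specified.

```python
import numpy as np


def block_means(values, good, block=5):
    """Mean of good pixels in each block; NaN marks an 'unknown' block.

    values : 2-D array of signal values (lines x pixels), one channel.
    good   : 2-D boolean array, True where the pixel passed SCREEN.
    Returns (means, known) on the block grid.
    """
    rows, cols = values.shape
    nbr = -(-rows // block)   # ceiling division: edge blocks may overflow
    nbc = -(-cols // block)
    sums = np.zeros((nbr, nbc))
    counts = np.zeros((nbr, nbc), dtype=int)
    for i in range(nbr):
        for j in range(nbc):
            sl = (slice(i * block, (i + 1) * block),
                  slice(j * block, (j + 1) * block))
            g = good[sl]
            counts[i, j] = g.sum()
            sums[i, j] = values[sl][g].sum()
    # Half the average number of good pixels per block, truncated.
    min_good = int(counts.mean() / 2)
    known = (counts > 0) & (counts >= min_good)
    means = np.full((nbr, nbc), np.nan)
    means[known] = sums[known] / counts[known]
    return means, known


# Example: a 10 x 10 scene whose upper-left block has no good pixels.
scene = np.arange(100, dtype=float).reshape(10, 10)
mask = np.ones((10, 10), dtype=bool)
mask[0:5, 0:5] = False
means, known = block_means(scene, mask)
```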

The result of the above two steps is indicated for an example scene 
by the block code map on the left in Figure 8b.1. In this map, numeric 
values indicate blocks with "known" mean values at this stage of 
processing. The numbers are the last digit of the average Tasselled 
Cap yellow value, after rounding to an integer form such that the 
average yellow value for the whole scene (using only good pixels) ends 
with the digit "5". The digit "6" thus represents blocks with an 
average yellow value in the range -11.9 ± .5, while the digit "4" 
represents blocks with an average yellow value in the range -13.9 ± .5. 
In general, less "yellow" indicates more haze. The characters "C", "H", 
and "W" represent blocks with unknown mean values at this stage of 
processing whose "non-good" pixels are mostly "cloud", "dense haze" or 
"water", respectively. The character "?" is used for blocks with all 
good pixels, but with an insufficient number of good pixels because the 
blocks overflow the boundaries of the scene. 
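
The numeric block codes can be reproduced with a short calculation. The sketch below is one reading of the rule described above: the block mean is expressed relative to the scene mean, rounded to the nearest integer, and offset so that the scene mean itself maps to "5". The function name is hypothetical.

```python
import math


def block_code_digit(block_mean_yellow, scene_mean_yellow):
    """Last digit of the rounded block mean, with the scene mean coded as 5."""
    offset = math.floor(block_mean_yellow - scene_mean_yellow + 0.5)
    return (5 + offset) % 10


# With a scene average of -12.9, the two cases quoted in the text:
digit_less_haze = block_code_digit(-11.9, -12.9)
digit_more_haze = block_code_digit(-13.9, -12.9)
```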

In the third step of the spatially varying XSTAR procedure, the 
mean values of the 5 line by 5 pixel blocks are smoothed, using a 
non-recursive moving window filter. The filter presently used (and 
recommended) has a window size of 5 blocks by 5 blocks, with a three 
block distance between half amplitude points in both the along-track 
and across-track directions. The weighting factors used in this filter 
are given in Table 8b.4. These weighting factors correspond to along-track 
and across-track cross-sections of the filter weighting function 
which are gaussian from the peak to the half amplitude points and which 
are mirror-image gaussian from the half amplitude points to zero. 
In this third step, smoothed mean values are calculated for all blocks 
with "known" mean values, and for all blocks with at least one near 
neighbor (either along-track or across-track) which has a "known" 
mean value. The smoothed mean value is the weighted average of the 
available "known" mean values within the window of the filter. 
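
The smoothing step can be sketched as follows. Table 8b.4 is not reproduced here, so the weights are generated from the verbal description (gaussian to the half amplitude point at 1.5 blocks, mirror-image gaussian to zero at 3 blocks) and should be treated as an assumption rather than the tabulated values. The filter is taken to be separable, a "near neighbor" is taken to be one of the four adjacent blocks, and the function names are illustrative.

```python
import numpy as np


def half_gaussian_weights(half_width, reach):
    """1-D weights: gaussian from the peak to the half amplitude point,
    mirror-image gaussian from there to zero at distance `reach`."""
    a = np.abs(np.arange(-(reach - 1), reach)).astype(float)
    inner = np.exp(-np.log(2.0) * (a / half_width) ** 2)
    outer = 1.0 - np.exp(-np.log(2.0) * ((reach - a) / half_width) ** 2)
    return np.where(a <= half_width, inner, outer)


def smooth_known_blocks(means, known, weights_1d):
    """Weighted average of the available known block means in the window.

    Computed for known blocks and for unknown blocks that have at least
    one known along-track or across-track neighbor; others stay NaN.
    """
    w2 = np.outer(weights_1d, weights_1d)
    half = len(weights_1d) // 2
    size = 2 * half + 1
    kp = np.pad(known, 1)   # pads with False
    near = kp[:-2, 1:-1] | kp[2:, 1:-1] | kp[1:-1, :-2] | kp[1:-1, 2:]
    vals = np.pad(np.where(known, means, 0.0), half)
    wts = np.pad(known.astype(float), half)
    smoothed = np.full(means.shape, np.nan)
    for i, j in zip(*np.nonzero(known | near)):
        win = (slice(i, i + size), slice(j, j + size))
        smoothed[i, j] = (w2 * vals[win]).sum() / (w2 * wts[win]).sum()
    return smoothed


# Example: a uniform scene with a 3 x 3 patch of unknown blocks.
w = half_gaussian_weights(1.5, 3)          # 5 taps, offsets -2..2
known = np.ones((6, 6), dtype=bool)
known[1:4, 1:4] = False
means = np.where(known, 3.0, np.nan)
smoothed = smooth_known_blocks(means, known, w)
```

In this example the center of the unknown patch has no known near neighbor, so it remains unknown after this step.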

The result of this third step is indicated for the example scene 
by the block code map on the right in Figure 8b.1. In this map zeros 
indicate blocks with smoothed mean Tasselled Cap yellow values within 
± .5 counts of the average for the whole scene. Minus signs correspond 
to less yellow smoothed mean values (indicating more haze) and plus 
signs correspond to more yellow smoothed mean values (indicating less 
haze). Other characters indicate blocks which still have "unknown" 
mean values. These are addressed in step 4. 

The combination of steps 2 and 3 above is equivalent to passing a 
25 line by 25 pixel moving window filter over the scene, and then 
sampling every 5th line and every 5th point to obtain the smoothed 
block mean values. The along-track or across-track cross-section of 
this equivalent filter weighting function, and its frequency response 
are plotted in Figure 8b.2. The distance between the half amplitude 
points of this equivalent filter is 15 pixels. The procedure described 
in steps 2 and 3 is considerably less expensive than applying the 
equivalent filter directly to the scene. The lumps in the frequency 
response for this procedure, visible to either side of .2 and .4 
cycles per pixel, are caused by the block averaging procedure in step 2. 
If we are to economize through block averaging, they cannot be avoided. 
Using a smaller block size would shift the lumps toward higher 
frequencies and would decrease their amplitude slightly. A block size 
of 1 line by 1 pixel would shift the lumps off the end of the frequency 
scale (beyond one half the sampling rate of Landsat). 

Although these lumps pass frequencies which may cause artifacts in 
the corrected data, their amplitudes seem to be small enough so that 
these artifacts would not be significant. 
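
The equivalence between steps 2 and 3 and a single 25-point filter, and the origin of the lumps, can be checked numerically in one dimension. The sketch below convolves a 5-pixel block average with the 5-block smoothing weights spaced 5 pixels apart. The block weights used are hypothetical stand-ins for Table 8b.4, generated from the gaussian description in the text.

```python
import numpy as np


def equivalent_kernel(block_weights, block=5):
    """1-D pixel-domain kernel equal to block averaging followed by
    block-level smoothing (before sampling every `block`-th pixel)."""
    spaced = np.zeros((len(block_weights) - 1) * block + 1)
    spaced[::block] = block_weights        # weights placed 5 pixels apart
    box = np.ones(block) / block           # 5-pixel block average
    kernel = np.convolve(spaced, box)
    return kernel / kernel.sum()


# Hypothetical 5-block weights (not the values of Table 8b.4).
weights = np.array([0.265, 0.735, 1.0, 0.735, 0.265])
kernel = equivalent_kernel(weights)

# Magnitude response on a grid of 0.001 cycles per pixel.
n_fft = 1000
freqs = np.fft.rfftfreq(n_fft)             # 0 to 0.5 cycles per pixel
response = np.abs(np.fft.rfft(kernel, n=n_fft))
```

The block average has nulls at exactly .2 and .4 cycles per pixel, while the spaced smoothing weights repeat their passband at those same frequencies. The product is therefore zero at .2 and .4 and leaves a small lump on either side of each, as described above.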
Step 4 of the spatially varying XSTAR procedure is used to assign 
smoothed mean values to blocks which still have "unknown" mean values 
after step 3. In this step only those blocks which have "unknown" 
mean values, but which have at least one near neighbor (either along- 
track or across-track) with a smoothed mean value, are assigned smoothed 
mean values according to the procedure of step 3. Step 4 is iterated 
until all blocks have smoothed mean values. 
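
The iteration in step 4 can be sketched as below. It assumes that each pass uses the smoothed values available at the start of that pass, that the window weights are the same as in step 3, and that unknown blocks are marked with NaN. The weights shown are hypothetical, and the function name is illustrative.

```python
import numpy as np


def fill_unknown_blocks(smoothed, weights_1d):
    """Repeat the step 3 weighted average until no block is unknown."""
    w2 = np.outer(weights_1d, weights_1d)
    half = len(weights_1d) // 2
    size = 2 * half + 1
    out = np.array(smoothed, dtype=float)
    if np.isnan(out).all():
        raise ValueError("no block has a smoothed mean to start from")
    while np.isnan(out).any():
        have = ~np.isnan(out)
        hp = np.pad(have, 1)   # pads with False
        near = hp[:-2, 1:-1] | hp[2:, 1:-1] | hp[1:-1, :-2] | hp[1:-1, 2:]
        # Snapshot of the values available at the start of this pass.
        vals = np.pad(np.where(have, out, 0.0), half)
        wts = np.pad(have.astype(float), half)
        for i, j in zip(*np.nonzero(~have & near)):
            win = (slice(i, i + size), slice(j, j + size))
            out[i, j] = (w2 * vals[win]).sum() / (w2 * wts[win]).sum()
    return out


# Example: only one corner block has a smoothed mean initially.
weights = np.array([0.265, 0.735, 1.0, 0.735, 0.265])   # hypothetical
grid = np.full((7, 7), np.nan)
grid[0, 0] = 4.0
filled = fill_unknown_blocks(grid, weights)
```

Each pass extends the assigned region by one block in the along-track and across-track directions, so the loop ends once the farthest block has been reached.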

The end results of step 4 for the example scene, and for the same 
scene on the following day, are indicated in Figure 8b.3. Note the 
dissimilarity of the block code maps for the two days shown in the 
figure. This indicates that there were significant spatial variations 
in the atmospheric condition on at least one of the two days (in this 
case, mostly on the first day). The spatial variations in the block 
average yellow values on day 140 (the second day, which was relatively 
clear) are not necessarily due to atmospheric variations. Such spatial 
variations can also be caused by the composition of the scene. In 
particular, soil color, moisture, canopy density, canopy yellowing, 
and land use (e.g., urban areas) are suspected to influence the 
Tasselled Cap yellow component in minor amounts. We do not know to 
what extent these yellow variations are detrimental to the performance 
of XSTAR in crop survey applications. An evaluation of this question 
would have to be the subject of a future research effort; however, we 
believe that the benefits of the spatially varying XSTAR correction 
will significantly outweigh the detriments. 

In step 5 of the spatially varying XSTAR procedure, the multiplicative 
and additive correction factors appropriate for each block mean value 
are calculated. This is done as specified in Reference 8b.1, with 
each block mean used as the value for x (steps 5 and 6 in the reference). 

Step 6 is the final step in the spatially varying XSTAR procedure. 
In this step the multiplicative and additive correction factors 
calculated for each block mean in step 5 are interpolated between block 
centers (in two dimensions) to determine the appropriate correction 
factors for each pixel. These correction factors are then applied 
pixel by pixel. Pixels which are near the borders of the scene, so 
that only one or two block centers are within the interpolating 
range (±4 lines and ±4 pixels) of the pixel, are corrected by interpolating 
the correction factors calculated for those blocks which have 
centers within the interpolating range. 

The interpolation used in step 6, above, is not a linear 
interpolation (although a procedure which interpolated linearly in the 
along-track and across-track directions could have been used). Instead, 
an interpolating filter is used. The weighting factors for this filter 
are specified in Table 8b.5. The weighting factors correspond to 
along-track and across-track cross-sections of the filter weighting 
function which are gaussian from the peak to the half amplitude points 
and which are mirror-image gaussian from the half amplitude points to 
zero (generally similar to the smoothing filter which was applied to 
the block mean values). The distance between the half amplitude 
points of this filter is 5 pixels in both the along-track and across- 
track directions. Note that the sum of those weighting factors which 
are located on a 5 line by 5 pixel grid is always unity (for proper 
steady-state response of the filter). When the filter is applied, the 
multiplicative and additive correction factors calculated for each block 
are associated with the pixel at the center of the block, and zeros are 
associated with all other pixels. Applying the filter (non-recursively) 
to the correction factors associated with the pixels (by the above 
procedure) produces the interpolated result. For pixels near the 
borders of the scene, correction factors which echo the pattern of the 
block mean correction factors along the border are dummied in for 
pixels outside the border as required by the filter. 
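
The interpolation can be sketched as a zero-insertion filter. Table 8b.5 is not reproduced here, so the 9-point weights below are generated from the verbal description (half amplitude at 2.5 pixels, zero at 5 pixels) and are an assumption. The filter is taken to be separable, the scene is assumed to be a whole number of blocks, and the border is handled by repeating the outermost block factors once, which is one simple way of echoing the border pattern. Function names are illustrative.

```python
import numpy as np


def half_gaussian_weights(half_width, reach):
    """1-D weights: gaussian from the peak to the half amplitude point,
    mirror-image gaussian from there to zero at distance `reach`."""
    a = np.abs(np.arange(-(reach - 1), reach)).astype(float)
    inner = np.exp(-np.log(2.0) * (a / half_width) ** 2)
    outer = 1.0 - np.exp(-np.log(2.0) * ((reach - a) / half_width) ** 2)
    return np.where(a <= half_width, inner, outer)


def interpolate_block_factors(block_factors, block=5):
    """Spread one correction factor per block to one factor per pixel."""
    w = half_gaussian_weights(block / 2.0, block)     # 9 taps for block=5
    nbr, nbc = block_factors.shape
    # Dummy blocks outside the scene repeat the border values.
    padded = np.pad(block_factors, 1, mode="edge")
    sparse = np.zeros(((nbr + 2) * block, (nbc + 2) * block))
    sparse[block // 2::block, block // 2::block] = padded  # block centers
    # Non-recursive separable filter: across-track, then along-track.
    tmp = np.apply_along_axis(np.convolve, 1, sparse, w, mode="same")
    out = np.apply_along_axis(np.convolve, 0, tmp, w, mode="same")
    return out[block:-block, block:-block]            # drop dummy blocks


# The weights on any 5-pixel grid sum to one, so a constant stays constant.
constant = interpolate_block_factors(np.full((3, 4), 2.0))
factors = np.random.default_rng(0).normal(size=(3, 4))
per_pixel = interpolate_block_factors(factors)
```

Because the mirror-image construction makes the weights at offsets d and 5 - d sum to one, the unity-sum property noted above holds by design, and each block center receives exactly its own block factor.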

The along-track or across-track cross-section of this interpolating 
filter weighting function, and its frequency response are plotted in 
Figure 8b.4. The desired characteristics of this filter are that it 
have a relatively flat frequency response from zero to 1/15 cycles per 
pixel and have relatively little response above 2/15 cycles per pixel, 
and that it require no more than 9x9 weighting factors. The filter 
seems to suit these requirements quite well. 

8b.5 COST/BENEFIT ANALYSIS 

The block size for calculating the block averages and the dimensions 
of the moving window filter used to smooth the block mean values were 
determined from a cost/benefit analysis of the spatially varying XSTAR 
procedure. In this analysis, cost and performance of the new procedure 
were compared to the former spatially invariant (global) XSTAR 
procedure. Figure 8b.5 indicates typical reductions in the RMS error in 
matching signal values (pixel by pixel) in consecutive day data which 
has spatial haze variations. In general we found that the smaller 
we made the effective size of the moving window smoothing filter, the 
better we could do. However, the cost of the correction increased with 
decreasing window size. This cost was strongly related to the block 
size used for calculating block averages, and varied as shown in Figure 
8b.6. We finally chose to use a 5 line by 5 pixel block size and a 
moving window filter which, in combination with the block size, produced 
the equivalent of a 15 line by 15 pixel moving window (measured between 
half amplitude points). 
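
The performance measure used in this comparison can be written compactly. The sketch below assumes two registered arrays of signal values from consecutive days and takes the RMS over every element supplied; how channels and screened pixels were pooled in the actual evaluation is not stated here, so that pooling is an assumption. The function name is hypothetical.

```python
import numpy as np


def rms_difference(day_a, day_b):
    """Pixel-by-pixel RMS difference between two registered data sets."""
    a = np.asarray(day_a, dtype=float)
    b = np.asarray(day_b, dtype=float)
    if a.shape != b.shape:
        raise ValueError("data sets must be registered to the same shape")
    return float(np.sqrt(np.mean((a - b) ** 2)))


# Example: a uniform 3-count offset between the two days.
example_rms = rms_difference(np.zeros((4, 4)), np.full((4, 4), 3.0))
```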

The performance of the spatially varying XSTAR correction is 
compared to the former global XSTAR correction and to no transformation 
(but with sun angle correction) for three data sets in Table 8b.6. One 
of these data sets is segment 1640 which was also presented in Figures 
8b.1 and 8b.3. All three scenes had significant spatial haze variations. 
The lower bound on the RMS error which is achievable by an ideal 
correction is not known, but could be limited by such factors as scene 
misregistration, view angle effects, and round-off error following the 
correction. 
Previous test results have indicated that this lower bound may be 
between 3 and 6 counts, depending on the scene being processed. We 
believe that the improvement we have achieved is significant and that 
the spatially varying XSTAR correction should eventually become a 
part of the standard LACIE procedure. 

FIGURE 8b.2 
LOW-PASS MOVING WINDOW SPATIAL FILTER 

FIGURE 8b.3 
SPATIALLY VARYING XSTAR HAZE CORRECTION 
(Seg # 1640) 
Spatially Smoothed and Interpolated Block Codes 

FIGURE 8b.4 
INTERPOLATING FILTER FOR APPLYING XSTAR CORRECTION TO EACH PIXEL 

FIGURE 8b.5 
PERFORMANCE OF SPATIALLY VARYING XSTAR HAZE CORRECTION 
ON FOUR CONSECUTIVE DAY DATA SETS WITH VARYING HAZE CONDITIONS 

FIGURE 8b.6 
COST OF SPATIALLY VARYING XSTAR HAZE CORRECTION 
AS FUNCTION OF BLOCK SIZE USED 